perm filename EXPERT[AM,DBL] blob
sn#600184 filedate 1981-07-17 generic text, type C, neo UTF8
Draft of RLL section of Building Expert Systems
2.4.1 RLL
Expert systems evolved to help (human) domain experts manage both the bulk
and the complexities associated with knowledge-intensive tasks. One
interesting and important observation is that the task of building expert
systems is itself a suitable one to make the target of an expert system.
This realization triggered the development of RLL, a representation
language language.
RLL is, first, a collection of tools which help the knowledge engineer
(KE) construct, use, and adapt expert system programs. In addition, it is
itself a bona-fide expert system -- knowledgeable in facts about
programming in general, and its own subroutines in particular. This
competency permits RLL to "understand" its internal inference procedures,
and provides the user with a mechanism -- simple editing of the
descriptions of those subroutines -- for modifying the RLL environment to
suit his current task. All too often, KEs have had to twist the facts,
rules, relations, etc. of a task in order to fit it into some rigid
representation scheme.
This section will quickly overview this RLL system, taking examples from
our experience coding up OSS -- the Oil Spill System. We hope to show the
gains from self-describing and self-modifying software, as well as
present techniques for recapturing any of the efficiency lost by going
down this route.
2.4.1.1 Overview of RLL
The hallmark of an expert system is competence in some task domain.
Usually, this does not extend to competence in meta-cognition, an ability
for a program to reason about its own structure and behavior. MYCIN,
e.g., doesn't "know" that it uses a backward chaining inference engine to
perform its deductions, let alone WHY it uses such a control structure.
AGE, e.g., cannot answer questions about the efficiencies of itself or of
the programs it writes, nor can it modify its basic Blackboard model to
handle, say, a different type of rule.
RLL begins, as does each KE himself, with a library of useful, common
types of slots (IsA, Inverse, RangeType, and several hundred others),
control mechanisms (backward chaining, agendae, and a few others), and
inheritance schemes. Each of these is represented internally in RLL as a
unit, a frame-like data structure of attributes (slots) and values. Thus,
each kind of slot, each kind of control scheme, each inheritance mode is
described within RLL's formalisms. Some of the units are standard
mechanisms for modifying "domain knowledge", and they can now serve to
modify any part of RLL itself. One dangerous consequence of this
philosophy is that they can modify themselves, perhaps destroying the
ability of the system thenceforth to be self-modifying.
At any moment, some ensemble of the units are combined into a description
of the desired current system, and RLL behaves in the prescribed manner.
That ensemble defines a representation language which the KE can "freeze"
if he desires, and use for some particular application he must build.
This is the reason we call RLL a representation language language. No
single set of starting primitives could possibly satisfy all of the users
all of the time.
To help the KE tailor the language to his current task, RLL provides a
number of tools (all represented as RLL units, of course) which the user
can use in constructing his own pieces, and in combining pieces into
ensembles. For example, the user can easily create a new type of slot or
inheritance procedure. Typically, he would copy a similar existing unit
and edit that copy. RLL then "compiles" the changed description into
executable code.
RLL's competence model of programming comes in at this point. There are
many indirect ramifications which arise when, say, a new laboratory
technique is added to Molgen's data base; similarly there are a host of
actions which must be taken when some part of the underlying
representation is changed. Just as Molgen's code is responsible for
propagating the effects of this new data, so the overall RLL system will
find the parts of the system which may be affected by this change, and
update these pieces appropriately.
RLL allows the specifics of a problem to guide the nature and
implementation of the user's final ES -- rather than force the system into
some particular formalism.
The remainder of this section fleshes out our so-far skeletal description
of RLL. Subsection 2.4.1.2 explains how facts are represented in RLL,
and shows how this structure is adequate to handle knowledge about both
the particular domain (eg Oil Spills) and RLL internals themselves (eg
facts about slots or control mechanisms). This section includes a
concrete example, using RLL's management of rules to show first how the
facts are stored, and then how various types of modifications can be done
-- ie how RLL has attained this degree of flexibility. Section 2.4.1.3
describes the control structure used for this particular OSS application,
in a fair amount of detail. This section also explains how RLL can
provide the self-describing and self-modifying abilities discussed
previously, and yet retain (recapture) almost all the efficiency of a
hand-tailored system. The final two sections describe how RLL fits into
the world, and what yet remains to be done.
2.4.1.2 Knowledge Representation in RLL
RLL is based on the "To every thing, a unit" philosophy. This uniform
representation system extends to all domain knowledge (e.g., units for
each type of oil and each pipe junction), and also to all representation
knowledge (e.g., units for the Viscosity type of slot and for the
BackChain control regime).
*--------*--------*--------*--------*
M6-2
Isa: (AnyManhole)
FeedsFrom: (M6-3 M6-2)
FeedsInto: M6-9
AnyManhole
Isa: (AnyClassOfPhysObjects)
Examples: (M6-2, M6-3, ...)
Description: The class of all manholes.
Isa
Isa: (AnySlot)
Inverse: (Examples)
Description: The slot signifying membership in a class.
MakesSenseFor: (Anything)
FeedsFrom
Isa: (AnySlot OilSpillConcept)
Inverse: FeedsInto
Description: The slot naming the upstream manholes.
MakesSenseFor: (AnyManhole)
*--------*--------*--------*--------*
RLL is, for better or worse, committed to using a particular
representation at its heart, though the apparent one seen by the user may
easily be changed, and in principle the innermost RLL might be as well.
That representation comprises a semantic network, with each node being a
structured frame-like "unit" whose attributes (slots) are defined with a
set of property-value pairs. We will show below the great flexibility RLL
does provide, within this constraint. We use the notation Oil:Isa to
refer to the value of the Isa slot of the unit Oil.
The first unit in the box above refers to a specific object which exists
in the real world -- M6-2 represents manhole #2 on waterline #6. The next
unit represents a class of physical objects; AnyManhole encodes facts
about the set of all manholes. The next two units represent kinds of
slots. Of course there are dozens of slots absent from the units in the
figure (e.g., Worth of M6-2, Cardinality of AnyManhole, Format of
FeedsFrom).
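To make the unit/slot vocabulary concrete, here is a minimal sketch, in
modern notation, of a frame store like the one in the box above. The
function names (new_unit, get_slot) are illustrative only; they are not
RLL's actual primitives.

```python
# A unit is a frame: a named bundle of slot/value pairs.
units = {}

def new_unit(name, **slots):
    # Create a unit with the given slots; values may be atoms or lists.
    units[name] = dict(slots)
    return units[name]

def get_slot(unit_name, slot):
    # The notation M6-2:Isa corresponds to get_slot("M6-2", "Isa").
    return units[unit_name].get(slot)

new_unit("M6-2", Isa=["AnyManhole"], FeedsInto="M6-9")
new_unit("AnyManhole", Isa=["AnyClassOfPhysObjects"],
         Examples=["M6-2", "M6-3"],
         Description="The class of all manholes.")
```

With this store, M6-2:Isa evaluates to (AnyManhole), just as the box
notation suggests.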
Some types of units (e.g., those representing events) can have varying
degrees of generality. Consider an example: We may want to say general
things about oil spilling into a body of water -- eg, that it causes a
sheen. Ideally all such facts should be stored in one place, say the
OilSpillingIntoWater unit, in such a manner that every more specialized
case will automatically inherit these facts. We want RLL to conclude
that oil spilling into a stream will cause a sheen in that stream once it
knows that OilSpillingIntoStream is a specialization of
OilSpillingIntoWater. We could go on to describe facts about
OilFromPipe93SpillingIntoWOC, a unit which, in turn, can be further
restricted into MachineOilFromPipe93SpillingIntoWOCAfterOutFall93, which
can be restricted to
DayOldMachineOil#2FromPipe93'sFirstOutletSpillingIntoWOC-
AfterOutFall93onAug23. Clearly we do NOT want to spawn a new unit for
every point in n-dimensional space. A new unit is never produced
arbitrarily, but rather only when there is some pertinent fact to preserve
about the event. When the fact is general, it goes on a general event
unit, which enables its descendants to inherit these facts.
The mechanism for handling this type of inheritance is distinct from the
more standard ElementOf or SubsetOf relations, and should be treated as
such. RLL ensures this separation by allocating different units to store
facts about each type of inheritance. Various properties of these
inheritance-defining units indicate how to initialize a new unit (as
stating that Pipe#33 is an Example of AnyPipe should be handled quite
differently from the statement that OilSpillingIntoStream is a
Specialization of OilSpillingIntoWater), or how the values of some slot
may be altered, to maintain this particular inheritance relation. (eg the
SubstanceSpilt of any Specialization, S, of OilSpillingIntoWater must be
some refinement of Oil, whereas S:TimeOfSpill can be arbitrary.
Analogously Pipe#33 can have slots like Length or TimeOfCreation, but not
things like Father or RangeType.)
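The behaviour of this Specialization inheritance can be sketched as a
lookup that climbs one named inheritance link until a value is found.
The relation name ("SpecializationOf") and the function name are
illustrative, not RLL's own; the point is only that each inheritance
relation is a distinct, separately-described path.

```python
# Facts stored on a general event unit are inherited by its
# specializations, by following the SpecializationOf links upward.
units = {
    "OilSpillingIntoWater":  {"CausesSheen": True},
    "OilSpillingIntoStream": {"SpecializationOf": "OilSpillingIntoWater"},
}

def inherited(unit, slot, relation="SpecializationOf"):
    while unit is not None:
        if slot in units[unit]:
            return units[unit][slot]          # fact stored here
        unit = units[unit].get(relation)      # else climb one link
    return None
```

Thus OilSpillingIntoStream inherits CausesSheen from
OilSpillingIntoWater, without a copy of that fact being stored on every
specialized event unit.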
Exactly how such data is encoded is far too detailed (and irrelevant) for
this report; the important fact is that such facts can be stated precisely
and explicitly in RLL. Furthermore, such facts can be altered by the
user, if his task demands it. Indeed, the user can even create new
"representational pieces": As none of RLL's prior tasks had needed this
Specialization inheritance, our starting system did not have this
particular feature. While doing the oil spill task, we designed this
totally new mechanism, and incorporated it into RLL. This is an example
of a kind of change which takes minutes in RLL, and is impossible or
awkward for most other languages. Note that this feature is now a part of
RLL -- subsequent users can now use this type of inheritance, as easily as
any of the other ones RLL "always" provided. We envision RLL will be
constantly growing -- largely through additions provided by its users.
Slots are another type of representational piece. As with inheritance
modes, the information relevant to each type of slot is stored in a unit.
The unit below shows that: The value of x:FeedsInto must be a list of
manholes, the only x for which x:FeedsInto is defined is a manhole, and
that if x:FeedsInto = y, then y:FeedsFrom = x (i.e. FeedsInto:Inverse =
FeedsFrom).
*--------*--------*--------*--------*
FeedsInto
Isa: (AnySlot)
Description: The slot naming the downstream manholes.
Inverse: FeedsFrom
HighLevelDefn: (Composition OtherEnd ConnectedPipes)
MakesSenseFor: (AnyManhole)
Format: SingleElement
Datatype: Unit representing a manhole.
ToCompute: (λ (u) [Find the pipe, P, to which this manhole connects.
See which pipe, p', this P is connected to.
Return the manhole m, which is connected to
pipe p'.])
Note - the ToCompute of a slot indicates how to deduce its value -- i.e.
S:ToCompute is a function, F, whose value, (F u), is the value to fill
u:S.
*--------*--------*--------*--------*
It is important to realize that the facts shown above are not JUST a
description of the FeedsInto type of slot -- they really are used to define
this slot. If any of those slots were ever changed, the way this slot
behaves would be affected. For example, resetting FeedsInto:Format to be
SetOfElements (rather than SingleElement) would cause RLL to redefine
FeedsInto:ToCompute to return a singleton list rather than a single atom
(corresponding to that manhole), and to (retroactively) fix up each value
of u:FeedsInto. If that value were empty, it would remain empty.
Otherwise the single value v would be changed into the list (v).
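The retroactive fix-up itself is simple to sketch. The function name
below is illustrative; it stands in for whatever RLL runs when a slot's
Format is reset from SingleElement to SetOfElements.

```python
# When a slot's Format changes to SetOfElements, every stored single
# value v becomes the list (v); empty (absent) values remain empty.
def refit_to_set(units, slot):
    for u in units.values():
        v = u.get(slot)
        if v is not None and not isinstance(v, list):
            u[slot] = [v]   # the single atom v becomes the list (v)

units = {"M6-2": {"FeedsInto": "M6-9"},   # single atom, will be listed
         "M6-3": {}}                      # no value, stays absent
refit_to_set(units, "FeedsInto")
```

After the fix-up, M6-2:FeedsInto holds the singleton list (M6-9) and can
thereafter have further manholes added to it.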
Similarly, each data structure -- each rule (condition/action heuristic),
task (small local goal), and agenda (list of tasks dealing with a single
topic) -- is stored as a unit. This permits the data to be dynamically
altered during the course of the computation, or through user input. That
is, one uses the same procedures to add in a new unit, independent of
whether that unit represents a new rule, a new pipe, or new type of slot.
Even the procedures of this system are encoded in units -- notably
including the parts of the control structure outlined below. As with
parts of the representation, these facts are actually used as the system is
run. Editing the units describing them will result in changes in the flow
of control of RLL. For brevity, we will address details of only one of
these chunks -- the rules. Section 2.4.1.4, which overviews the control
structure actually used, will then sketch the other parts of the system's
control, emphasizing the advantages which come from "unitizing" these
parts.
2.4.1.2.1 Example - How RLL Encodes Rules
Each rule has a number of obvious attributes. The primary facet of each
rule is the actual code which is to be executed. RLL spares the user the
arduous task of entering the actual LISP expression to be run; instead the
user can type a more laconic high-level specification of what this rule is
to do. RLL then "expands" this definition into that executable code.
(This is done in a very general, RLL-ish manner, using facts stored in the
name of the slot. Hence it is the unit associated with the ThenTellUser
slot which "knows" to expand a message, like
("Do not breathe this chemical, " Chemical "!!"
" It is " (Toxicity Chemical) ".")
into code which prints it when the user is present --
(PROGN (WriteToUser "Do not breathe this chemical, " (GetVal Chemical) "!!")
(WriteToUser " It is " (GetVal (Toxicity Chemical)) "."))
)
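The expansion step can be sketched as a small template-walker: literal
strings pass through, and anything else is treated as a slot reference to
be looked up at run time. The representation of references as tuples,
and the function name, are illustrative assumptions, not RLL's encoding.

```python
# Expand a high-level ThenTellUser template into the message it prints:
# strings are literal text; tuples stand for slot references like
# (Toxicity Chemical), looked up in the current environment.
def expand_tell_user(template, env):
    return "".join(item if isinstance(item, str) else str(env[item])
                   for item in template)

template = ["Do not breathe this chemical, ", ("Chemical",), "!!",
            " It is ", ("Toxicity", "Chemical"), "."]
env = {("Chemical",): "Benzene",
       ("Toxicity", "Chemical"): "highly toxic"}
message = expand_tell_user(template, env)
```

RLL's actual expansion produces executable code (the PROGN shown above)
rather than a string, but the division of labor is the same: the unit for
the ThenTellUser slot knows how to turn the laconic form into the
runnable one.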
*--------*--------*--------*--------*
Rule#332
Isa: (AnyRule)
Description: Tell the user to hold his breath if the chemical is toxic.
IfPotentiallyRelevant: (APPLY ToxicP Chemical)
IfTrulyRelevant: (APPLY NearbyUser ChemicalHasSeepedTo)
ThenTellUser: ("Do not breathe this chemical, " Chemical "!!"
" It is " (Toxicity Chemical) ".")
ThenAddToAgenda: EmergencyProcedures
Priority: High
OnTask: (ImminentDanger)
ImminentDanger
Description: This task tries to find if the current situation is
dangerous; and if so, suggest solutions/fixes/alternatives.
Isa: (AnyTask)
RuleList: (Rule#332, ...)
EmergencyProcedures
Description: This task is called iff the current situation is dangerous;
and issues many quick-and-dirty orders.
Isa: (AnyTask)
RuleList: (Rule#981, ...)
- Note the rule above is what the user would type.
RLL would use this to compute the actual to-be-run code, as it was needed.
*--------*--------*--------*--------*
There are several major advantages to providing both high and low level
forms of a rule specification. In addition to facilitating the creation
of new rules, this higher level makes the nature of the rule easier to
understand and reason about, for both the user and RLL itself. RLL is
able to use different versions of the definition of the procedure or slot
for different types of tasks, in much the same manner that InterLisp
handles both compiled and interpreted forms of each function. When a user
wishes to examine or edit a function, IL pulls in that function's source
listing, rather than force the user to peruse LAP code. However it is
that low-level LAP code which IL actually runs. Similarly when someone
wants to examine a rule (ie to estimate how long it will take, or determine
what sorts of things this execution will affect), the rule's concise
"declarative" form of the specification is offered. This, of course, is
NOT the form which is actually run. The much-less-perspicuous expanded
form is used for this purpose. This also represents a sizable savings of
run-time speed. Section 2.4.1.5.3 addresses this, and related, issues.
The example above shows how the specifications for this rule are scattered
about, stored in several distinct slots. This permits different parts to
be executed independently -- for example, allowing the quick
IfPotentiallyRelevant check to be run on all of a wide class of rules; and
then running the more expensive IfTrulyRelevant part only on that small
subset which passed.
The THEN parts are also split up. In common practice, once the IF-parts
have all passed, each of the THEN parts is run, in the appropriate order.
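This two-stage filtering can be sketched directly. The dictionary-based
rule encoding below is an illustrative stand-in for RLL's rule units.

```python
# Run the cheap IfPotentiallyRelevant test over the whole rule set first;
# apply the expensive IfTrulyRelevant test only to the survivors.
def select_rules(rules, situation):
    maybe = [r for r in rules if r["IfPotentiallyRelevant"](situation)]
    return [r for r in maybe if r["IfTrulyRelevant"](situation)]

rules = [
    {"name": "Rule#332",
     "IfPotentiallyRelevant": lambda s: s["toxic"],
     "IfTrulyRelevant":       lambda s: s["user_nearby"]},
    {"name": "Rule#981",
     "IfPotentiallyRelevant": lambda s: False,   # filtered out cheaply
     "IfTrulyRelevant":       lambda s: True},
]
fired = select_rules(rules, {"toxic": True, "user_nearby": True})
```

Only the rules surviving both stages have their THEN parts run, in the
appropriate order.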
The description above is the way a rule is evaluated "by default" -- that
is, unless the user has something else in mind. Recall any user can
readily modify that rule-evaluating procedure as well. So, for example,
the user may decide instead to execute the code stored on the
IfWorkingOnTask slot of a rule, or IfTodayIs, or any other set of slots,
rather than the IfPotentiallyRelevant and IfTrulyRelevant slots
now used. In getting EURISKO running, Lenat had to define three additional
rule interpreters of this sort, though most of them refer to the same
kinds of If- and Then- slots of rules.
The user may also decide to compose a single Total-IF-Part, which ANDs
together all of the different types of If-parts. Similarly he may want
the THEN parts executed in a different order; or may have decided to only
run those THEN parts for which he has sufficient resources.
Note that this could not be done if each rule were simply a single piece
of compiled code, or only a double lump of code (IF and THEN). We claimed
above that it was easy to change the rule interpreter. This is because
that block of code is itself an RLL unit. Like the rules, it is
decomposed into nice sized chunks, which can be independently modified.
One such hook is a list of slotnames which constitute the IF part of a
rule, together with a description of when that code should be executed.
We will return to this point in the next section, which discusses the
basic control structure.
*--------*--------*--------*--------*
StandardRuleInterpreter
Isa: (AnyRuleInterpreter)
Description: This is the standard procedure used to interpret a given
rule. It first executes the AND-junction of the rule's
If-Parts. If the result is nonNIL, it executes all of
rule's Then-Parts.
HighLevelDefn: (IF (AND-Junction ProcessPart IfParts)
(AND-Junction ProcessPart ThenParts))
UsedByRules: (Rule#3 Rule#5 ...)
IfParts
Isa: (AnySlot)
Description: The value of this slot is the list of slots of the unit
(here a rule) which descend from the class, AnyIfPart.
These are ordered by the value of their respective
OrderForRuleParts.
HighLevelDefn: (PutInOrder (Subsetting MySlots
(ValueIncludes Isa AnyIfPart))
OrderForRuleParts)
MakesSenseFor: Rules
Datatype: Slots
Format: OrderedSetOfElements
Simplified version of the Standard Rule Interpreter.
Much of its code is "hidden" in the parts of the HighLevelDefn -- eg
IF, AND-Junction, ... Of course, these too are units, and are visible to the
interested user.
(The IfParts unit is shown, to dissuade those who do not believe this claim.)
*--------*--------*--------*--------*
Before leaving this rule description, we note several other
"declarative" facts which might be stored in each rule. One important
slot is the average CPU time this rule has spent, for executing its IF
parts and THEN parts, respectively. This can be used for various
meta-level reasoning tasks -- such as deciding whether to even consider
this rule in the future (ie did its bang justify its bucks).
Each rule also records, in a slot, the name of its author or creator,
which might be a person and might be another rule (rules can inspect,
modify, and synthesize other rules). This can help decide who to
credit/blame for good/bad rules; and this data can help the overall system
to improve its performance by relying more and more on capable (that is,
successful) rule writers. Finally, the user himself can store any other
type of information he wants -- these are just things which we feel may be
useful types of statistics.
Of course various inference schemes -- most importantly an inheritance
hierarchy -- are used to simplify data entry. A new rule can be entered
by merely copying an existing prototype, and changing only those entries
which are inappropriate. Hence the bulk of the properties are simply
inherited from more general ideas -- here from RLL's facts about rules in
general. *
[Fnnote*: In fact, for pedagogic reasons, we might make the novice user
master only a small subset of the slots initially present in RLL. He need
learn about other dimensions of RLL's extendability only when he begins to
complain about how awkward it is to encode this or that fact, or wonders
why he can't seem to capture some particular type of idea. Until that
time, for example, he need have nothing more than a "behaviour
description" of the rule interpreter -- which is all he would ever know
about the corresponding evaluator in most other expert systems.]
The above description of rules illustrates the degree of flexibility
available to the user at the designing stage. The current RLL includes
many rule interpreters which we, the RLL designers, felt were appropriate
and useful. It is important to realize, however, that the eventual user
is not limited to only these. Not only is he free to design his own
interpreters, but RLL aids in this mission, by providing tools which
facilitate constructing such interpreters. Indeed, as these tools are
represented in RLL as well, the user can even construct his own building
aids, if he feels the need.
There are similar mechanisms for implementing the processes used for
"running" agendae, or tasks, or other control regimes. (For example a
black board architecture, for other types of tasks.) Corresponding to
each such process is the data structure to which it applies; and RLL
includes means for extending or adapting those as well -- ways of
extending tasks, for instance.
This section tried to demonstrate how useful RLL's competence in this
programming domain is when designing and building a new expert system.
The next subsection shows another application of this facility -- when the
user wishes to modify his existing ES. Here the user has the option of
changing his mind later, knowing that RLL will not only modify the code,
but also perform the retroactive updates to his data required to keep code
and data in synch.
2.4.1.2.2 RLL's Modifiability
Different systems have different conventions defining what part of the
running system is changeable, and which is, to all but the most
sophisticated user, untouchable. By design even casual users can enter,
change or delete "domain data" -- facts, in this case, about the
connectivity of pipes, or the precise placements of various buildings.
(Eg if the user later realizes that Pipe#406 really joined Pipe#317,
rather than Pipe#316 as he had thought.)
In many representation systems, the user can address and modify generic
objects as well -- for example, respecify which features are well-defined
for types of roads, or redraw the taxonomical information about types of
oil. In addition to these, some languages permit the user to define
totally new types of relations -- in the form of new slots for our RLL
language. (For example, the FeedsInto slot which was handily defined as
the Composition of the OtherEnd and the ConnectedPipes slots.)
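The Composition of two slots is easy to sketch: apply the inner slot to
the unit, then the outer slot to the result. The store and function
names below are illustrative, not RLL's.

```python
# FeedsInto as (Composition OtherEnd ConnectedPipes): follow the
# manhole's pipe, then take the manhole at that pipe's other end.
units = {
    "M6-2":   {"ConnectedPipes": "Pipe#7"},
    "Pipe#7": {"OtherEnd": "M6-9"},
}

def get_slot(u, s):
    return units[u][s]

def compose(outer, inner):
    # Build a new slot-computing function from two existing slots.
    return lambda u: get_slot(get_slot(u, inner), outer)

feeds_into = compose("OtherEnd", "ConnectedPipes")
```

Defining a new slot this way requires no hand-written code at all; the
HighLevelDefn names the pieces and RLL assembles them.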
These definitions can be used to redefine the "meaning" of a slot as well
as to create one. Suppose you realized that pipes could really join many
pipes, rather than just one. This would be difficult to say in many
languages -- indeed the only solution in most might require throwing away
that FeedsInto link and defining a totally new FeedsIntoS link. Even then
all of the existing data would have to be transferred, regrettably by hand.
In RLL, on the other hand, all of the executable code is made explicit,
and the user is permitted to modify it to suit his design. All he need do
is edit the FeedsInto unit, examine the Format slot, notice it says
SingleElement, and change that to read SetOfElements. As he exits the
editor, RLL makes the necessary changes; all units' FeedsInto slots will
then look like singleton sets, and can have multiple entries added on at
any time.
As we earlier pointed out, RLL provides various tools for such adaptations
-- for example, the high level specification language for functions, and
the appropriate chunking of the knowledge. (These are the same
things which proved so useful for constructing parts -- but they can also
be exploited for retroactively modifying existing parts.) RLL's uniform
representation of facts makes changing the code as easy as altering any
domain data -- as the fact which specifies a slot's arity is in the same
format as data about Oil#31, or any other domain or representation fact.
Furthermore, RLL's "understanding" of things like SingleElement and
SetOfElements allows it to do all the appropriate fixes, automatically.
RLL extends this modifiability to include the control structure as well.
This permits the user to construct his own form of control, as he needs
it.
The next section outlines the control structure we found useful for this
particular Oil Spill task. We wish to emphasize, again, that this Agenda
mechanism is but one of several control structures, some implemented and
some merely envisioned -- and that the user could have designed his own
from smaller pieces supplied by RLL, using tools present in RLL.
2.4.1.3 Control
The characteristics of the user's current domain task will strongly
influence the choice of which control regime to select, what inference
algorithms will be active, what slots will be employed, etc. For example,
the complete OSS is expected to perform a variety of different tasks --
from determining the type of the spill to performing various remediation
tasks, such as notifying the authorities. Another important
characteristic of this job is the timeliness and ordering of these tasks
-- the importance of recording the occupation of first witness dwindles
when compared with telling him not to breathe the toxic fumes. Hence
determining the toxicity of the vapors should be of paramount importance,
especially when the leak might be near people. A well structured agenda
seemed ideally suited to these specifications. The agenda has tasks of
varying levels of generality, each justified by some symbolic reasons
which yield a numeric priority value for the task. Some tasks, such as
those involving human lives, will have much higher priority than tasks
whose reasons are "long-term demographic data". As a task executes, it
may add new tasks to the agenda, occasionally supplanting what would have
been the next task to execute.
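The agenda's behaviour can be sketched as a priority queue over tasks,
where each task's numeric priority is derived from the weights of its
symbolic reasons. The reason weights and names here are illustrative
assumptions, not values from OSS.

```python
import heapq

# Hypothetical weights for symbolic reasons; tasks justified by
# human-lives reasons outrank long-term record-keeping.
REASON_WEIGHTS = {"human-lives": 100, "long-term demographic data": 5}

def priority(task):
    return sum(REASON_WEIGHTS[r] for r in task["reasons"])

def run_agenda(tasks):
    # Highest-priority task is processed first (heapq is a min-heap,
    # so priorities are negated; the index breaks ties stably).
    heap = [(-priority(t), i, t) for i, t in enumerate(tasks)]
    heapq.heapify(heap)
    order = []
    while heap:
        _, _, task = heapq.heappop(heap)
        order.append(task["name"])
    return order

tasks = [{"name": "RecordWitnessJob",
          "reasons": ["long-term demographic data"]},
         {"name": "WarnOfToxicFumes", "reasons": ["human-lives"]}]
```

In the full mechanism a running task may of course push new tasks onto
this same queue, which is how a freshly-proposed task can supplant what
would have been the next one to execute.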
This Control subsection has three other parts. The first is simply a
quick behaviour sketch of this overall mechanism. The second part goes
into more detail, describing how these chunks of code were generated,
encoded, and processed. These descriptions seem to imply that RLL must,
necessarily, be hideously slow. This is NOT the case. The third part
specifically addresses the issues of speed and efficiency -- a
sufficiently important, and misunderstood, point that it deserved to be
included in this description of RLL. Almost all of the lost efficiency
can be recaptured.
2.4.1.3.1 Overview of Agenda Mechanism
The agenda contains an unordered set of tasks, which are processed one at
a time. Tasks are chosen based on the priorities of the reasons
supporting them. Each task is designed to perform a particular bite-size
job, to meet a local goal. For example, the goal of one task may be to
determine the values of some parameters (e.g., the task MatType attempted
to deduce the type of material which had spilt); another may add other
tasks to this agenda (for example, the task in charge of effecting the
Countermeasures added several new tasks to the agenda -- one to notify the
authorities, another to begin the clean up process, etc.); and a third
type would print out instructions/requests to the user (eg "go to Manhole
#34 and tell me if you see oil"). There were several other (pre-existing)
kinds of tasks which were not necessary for this particular domain -- for
example, one type could suggest and create a new concept.
In general, executing a task involves first collecting relevant rules,
then ordering and firing this list in sequence. (For this limited oil
spill project, this collection step was trivial -- it was simply looking
up the set of rules pre-stored on each particular task.) There are many
types of rules, each charged with (attempting to) achieve a different type
of goal. (In this sense they are similar to tasks, only on a smaller
scale.) Some rules employ a known technique (algorithm) to evaluate the
value of some attribute. (For example, determining the toxicity of a
given material by looking it up in a table.) Others may decide to add
additional rules to this rule-set, or propose adding a particular new task
to the agenda. Still others may suggest that the current task (in which
this rule occurs) be suspended, to await necessary new data. (Note the
flow of command is quite structured. Each rule, task or agenda* can only
affect its immediate surroundings. More global changes are performed by
sending messages to its overlord, proposing this alteration.)
* Footnote: RLL (EURISKO) is capable of dealing with several agendae --
swapping among them as need arises. This facility was one of many RLL
features which were not needed for this job.
2.4.1.3.2 The Details of the Agenda Mechanism
We chose to view each part of the overall control as a process --
each operation took some prescribed type of input, and performed
some action, possibly returning a result.
As sketched in the previous subsection, the "Agenda-Processor" looked like this:
Initialize <Agenda>
Loop thru <Tasks>:
Get next <Task>, T
Process T
PostMortem <Agenda>.
The "Task-Processor" is (not surprisingly) similar:
Initialize <Task>
Loop thru <Rules>:
Get next <Rule>, R
Process R
PostMortem <Task>.
as is the overall OSS Process -
Initialize <OSS-System>
Loop thru <Agenda>:
Get next <Agenda>, A
Process A
PostMortem <OSS-System>.
This commonality was factored out, and exploited by defining the
EuriskoProcess * unit, shown below, as a common "ancestor" to both
task-processors and agenda-processors (as well as other types and levels
of control structure -- such as the overall top-level systems for both
Eurisko and this OSS project).
[*fnnote: RLL was developed as the underpinnings for the EURISKO project,
described in [Lenat The Nature of Heuristics]. EURISKO relies on a
sophisticated, modifiable agenda mechanism, which is why it was the first
control structure developed and incorporated into RLL.]
-----
AnyEuriskoProcess
Isa: (AnyClassOfObjects)
SuperClass: (AnyProcess)
SubClass: (AnyTaskProcess, AnyAgendaProcess, ...)
Examples: (FullEURISKO-System, FullOSS-System ...)
TypicalExample: TypicalEuriskoProcess
...
TypicalEuriskoProcess
TypicalExampleOf: AnyEuriskoProcess
FunctionalSlots: (LispFn ToInitialize ToSelectNextSub ToXeqSub
ToPostMortem)
...
-----
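The common Initialize / select-next / execute / PostMortem schema can be
sketched as a single parameterized loop whose pieces are the values of
the FunctionalSlots listed on TypicalEuriskoProcess. The Python
rendering and the sample processor below are illustrative only.

```python
# One generic driver serves agenda-, task-, and system-level processors:
# which behaviour results depends only on the four functional slots.
def run_process(proc, state):
    proc["ToInitialize"](state)
    while True:
        sub = proc["ToSelectNextSub"](state)
        if sub is None:                  # no subtask/rule left
            break
        proc["ToXeqSub"](sub, state)
    proc["ToPostMortem"](state)
    return state

# A toy task processor: its "subs" are rules, fired in order.
task_processor = {
    "ToInitialize":    lambda s: s.setdefault("fired", []),
    "ToSelectNextSub": lambda s: s["rules"].pop(0) if s["rules"] else None,
    "ToXeqSub":        lambda r, s: s["fired"].append(r),
    "ToPostMortem":    lambda s: s.pop("rules"),
}
state = run_process(task_processor, {"rules": ["Rule#3", "Rule#5"]})
```

An agenda processor would plug different functions into the same four
slots -- selecting the highest-priority task rather than the next rule --
without touching the driver.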
Defining the task processor is now fairly straightforward -- first create
the OSS-TaskProcessor unit as an example of AnyTaskProcess, and then fill
in the values of its functional slots: ToInitialize, ToSelectNextSub,
ToXeqSub, and ToPostMortem. (As we'll see in a moment, the value of
LispFn is generated from these components. As such the user is NOT
expected to enter this value.) Actually the user's chore is further
simplified as any of these slots he leaves unspecified defaults to the
value of the corresponding slot of TypicalTaskProcess. *
[*fnnote: This only works because the various involved slots are all
"inheritable slots" -- ie their respective values can be determined by
following the inheritance paths; and because the first prototype these
tasks will find is TypicalTaskProcess.]
-----
AnyTaskProcess
Isa: (AnyClassOfObjects)
SuperClass: (AnyEuriskoProcess)
SubClass: ()
Examples: (OSS-TaskProcessor)
TypicalExample: TypicalTaskProcess
...
TypicalTaskProcess
TypicalExampleOf: AnyTaskProcess
ToInitialize: DefaultTaskInitializer
ToSelectNextSub: DefaultRuleSelector
ToXeqSub: DefaultRuleProcessor
ToPostMortem: DefaultTaskPostMortemor
...
-----
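The defaulting rule just described -- look on the unit itself, and fall back along the inheritance path to the typical example -- can be sketched as follows. This is an illustrative Python model only; the units are modeled as dicts, and the "Prototypes" slot is a hypothetical stand-in for RLL's actual inheritance machinery.

```python
# Hypothetical sketch of default-by-inheritance. For an OSS task processor,
# the first prototype reached along the inheritance path would be
# TypicalTaskProcess, exactly as the footnote above requires.

UNITS = {
    "TypicalTaskProcess": {
        "ToInitialize": "DefaultTaskInitializer",
        "ToSelectNextSub": "DefaultRuleSelector",
        "ToXeqSub": "DefaultRuleProcessor",
        "ToPostMortem": "DefaultTaskPostMortemor",
    },
    "OSS-TaskProcessor": {
        "Prototypes": ["TypicalTaskProcess"],
        # The user overrides only what differs from the typical case:
        "ToSelectNextSub": "OSS-RuleSelector",
    },
}

def get_value(unit, slot):
    """Return a slot's value, falling back along the inheritance path."""
    u = UNITS[unit]
    if slot in u:
        return u[slot]
    for proto in u.get("Prototypes", []):
        value = get_value(proto, slot)
        if value is not None:
            return value
    return None

# get_value("OSS-TaskProcessor", "ToSelectNextSub") -> "OSS-RuleSelector"
# get_value("OSS-TaskProcessor", "ToInitialize")    -> "DefaultTaskInitializer"
```

The user thus supplies only the one slot where his processor differs; the other three are inherited without his ever touching them.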
These "functions", DefaultTaskInitializer, DefaultRuleSelector, etc., are
themselves units, which the user can examine and modify, if desired.
The "LispFn" of this OSS-TaskProcessor is now filled in by connecting the
code present (or virtually present) in the various functional slots of the
OSS-TaskProcessor unit, to correspond to the schema shown above.
But so far, no one (or rather, no task) knows about this nicely assembled
clump of code. To tie it in, each task which wants to use it must point
to this OSS-TaskProcessor. Setting the value of Task#34:ToProcessMe to
OSS-TaskProcessor achieves this effect. In fact, as the value of this
ToProcessMe slot is inheritable, setting the value of
TypicalOSSTask:ToProcessMe to this OSS-TaskProcessor means everything
which includes TypicalOSSTask as a prototype will use OSS-TaskProcessor by
default -- unless the user, wanting a different task processor to be used
for this particular task, explicitly stores the name of that processor on
this slot.
Creating the OSS-AgendaProcessor is quite similar, here exploiting facts
stored on TypicalAgendaProcessor as needed. Note that it is the Agenda
controller which decides what to do with each task. By default, each
agenda "processes" each of its tasks, using the mechanism which looks on
that task's ToProcessMe slot, and calls that function on this task. Of
course, the agenda is not forced to do this -- it may "know better", and
use some other task-processor -- for example, one which is less accurate
but quicker -- for all the tasks. Or it may use this power only sometimes,
for example, if the agenda processor is told that storage space is low.
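This dispatch-with-override behavior can be sketched briefly. Again this is an illustrative Python model, not RLL code; the processor functions and task names here are hypothetical.

```python
# Hypothetical sketch: by default the agenda controller looks on each
# task's ToProcessMe slot and calls that function on the task, but it may
# "know better" and substitute another processor (say, a quicker but less
# accurate one) for all the tasks.

def default_task_processor(task):
    return "processed %s carefully" % task["Name"]

def quick_task_processor(task):
    return "processed %s quickly" % task["Name"]

def run_agenda(tasks, override=None):
    """Process each task with its own ToProcessMe, unless overridden."""
    results = []
    for task in tasks:
        processor = override or task["ToProcessMe"]
        results.append(processor(task))
    return results

agenda = [{"Name": "Task#34", "ToProcessMe": default_task_processor}]
run_agenda(agenda)                                 # honors ToProcessMe
run_agenda(agenda, override=quick_task_processor)  # "knows better"
```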
2.4.1.3.3 Efficiency
From the above descriptions, one may infer that RLL is actually
"executing" high level specifications of various control algorithms that
the user input. Were this true, RLL would have sacrificed any hope for
speed and (time-)efficiency for the sake of flexibility. That is NOT the
case. RLL preserves both high level descriptions AND what these forms
"compile" into -- and it is that latter body of fast code which is
actually run. Much of RLL's boot-strapping code is devoted to maintaining
these correspondences, so that any change to a high level form marks the
(formerly-)corresponding executable code as invalid. On demand (that is,
the next time that code is needed), RLL will expand the new high level
definition into runnable code, and store these low level forms so they
will be accessed speedily the next time the code is needed.
RLL's code can be arbitrarily fast. Rather than "interpret" the high
level descriptions, RLL first "compiles" these into efficient forms, and
runs this faster code. This costs only a constant overhead -- for the one
time charge of expanding that terse description. From there on the code
can be arbitrarily efficient. The first time a GET is done, or the first
time the value of a high level slot is called for after its definition has
been written or changed, there will be a pause; but the next (and future)
times there will be no noticeable slowdown compared with hand-crafted Lisp
code.
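The compile-and-cache discipline just described -- keep the high level definition, invalidate the compiled form whenever the definition is edited, and recompile lazily on the next call -- can be sketched as follows. This is an illustrative Python model; the class name and the use of eval as a stand-in for RLL's expansion machinery are both hypothetical.

```python
# Hypothetical sketch of compile-and-cache with invalidation. The "compile"
# step here is just eval on a one-argument expression, standing in for
# RLL's expansion of a terse high level description into runnable Lisp.

class SlotDefinition:
    def __init__(self, high_level_def):
        self._definition = high_level_def
        self._compiled = None            # filled in lazily, on demand

    def redefine(self, new_def):
        self._definition = new_def
        self._compiled = None            # mark the old code as invalid

    def run(self, *args):
        if self._compiled is None:       # (re)compile on demand -- the pause
            self._compiled = self._expand(self._definition)
        return self._compiled(*args)     # thereafter, full compiled speed

    def _expand(self, definition):
        # Stand-in for expanding a terse description into executable code.
        return eval("lambda x: " + definition)

double = SlotDefinition("x * 2")
double.run(21)        # first call compiles, returns 42
double.redefine("x * 3")
double.run(21)        # definition changed: recompiles, returns 63
```

Only the first call after a (re)definition pays the expansion cost; every later call runs the cached compiled form directly.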
Basically RLL has paid for its flexibility in terms of space -- to
maintain the different versions of the code -- and not in time. More
details on how these multiple versions are maintained, used and cached can
be found in [Lenat et al] and [Greiner].
2.4.1.4 System Facilities
STRENGTHS
This section, so far, has emphasized RLL's strengths which derive from
its "competence model" of programming in general, and of the data
structures and algorithms it uses in particular. Many of its other wins
are due to its environment. Because it is built on top of CORLL, a
demand paging system [CORLL], RLL is able to ignore InterLisp's 256K
storage maximum, and store an almost unlimited number of units. Other big
advantages come from InterLisp itself -- including things like spelling
correction and inline editors.
Having all knowledge represented uniformly should be a powerful boost to
attempts at meta-cognitive systems. RLL needs this meta-level reasoning
ability, since it must monitor and model and modify itself. Furthermore,
it is possible to write new slots in terms of old ones, slots which skew
the system to be more appropriate to CAI than to performance on a task.
This may aid in realizing multiple uses from a single knowledge base.
WEAKNESSES
RLL's biggest problem, at this stage of its development, is its
immaturity. By design, it will continuously incorporate new facts about
control structures, forms of representation and modes of inheritance.
However, OSS was built way too early in this ongoing design process. For
example, RLL had never encountered a Solvable Problem before -- all of its
previous work had been directed towards AM-like searches -- explorations
which never terminate with a final answer (ie RLL knew about tasks like
"Find examples of primes", but not about things like "What is the source
of the spill").
Another big weakness, which also stemmed from its youth, was its lack of a
user front end. A trivial example involves the way we were forced to
enter the code -- everything was in a LISPy (GetValue 'Material 'Name),
rather than a nicer "Material's name". At a deeper level, RLL did not yet
have any notion of context -- this forced us designers to explicitly type
(continuing the above example) "Material name" rather than the
(sufficient) "name". A good front end could have aided in the design of
this system, by suggesting appropriate data structures and algorithms as
we went. This too was missing.
A smaller shortcoming, again ephemeral, is RLL's dependency on InterLisp.
Our long term plans are to build RLL into a module which can be run in MRS
[MRS]. RLL will, at that point, inherit the machine and LISP-dialect
independence of MRS.
RLL's final major fault seems almost paradoxical: its immense
adaptability. Too much freedom, we found, forced us to spend considerable
time deciding which of many plausible options really was the best. This
time, of course, was not ill-spent; the final OSS is certainly better than
the product of any system which forced a choice at each step, molding our
final code to match the ESBp's limited set of usable components. However,
as such external constraints would certainly have narrowed down our search,
we might even have finished the system in the four days we were allotted.
We believe this type of problem could have been alleviated in a more
mature system, which included a competent front end which would have
served as "gentle guide" during this building process.
2.4.1.5 Conclusion
We will conclude with a perspective on RLL -- in terms of its past, its
current status, and our long term plans.
Two years ago, RLL began as a two week project to build a representation
language on which the EURISKO system would be built. We soon realized
many non-trivial research issues had to be addressed before such a
general, "self-encoding" language could be constructed. We eventually
abandoned the quest for this ultimate language, satisfying ourselves with
the generality which was possible within a limited set of constraints.
The more theoretical aspects of this representation language language work
were taken up by other members of the HPP community; their next step is
described in [MRS].
The current system plays a necessary boot-strapping role -- certain parts
of the system must be present, as these are used to fill in other parts.
For example, the code which is used to expand high level definitions must
be present -- or at least enough of it to generate the rest of itself.
We are expanding the number and size of the knowledge bases RLL contains,
and as the magnitude and variety expand, reasoning by analogy becomes
increasingly feasible. Important in this list of knowledge bases is the
elusive "expertise about expertise." It is this knowledge which enables
RLL to fashion new ESs with greater efficiency and accuracy. The EURISKO
system, by Lenat, contains knowledge bases about elementary mathematics,
games, heuristics, representation, and Lisp programming. There was no
extra work needed to add "heuristics" to that list, since each heuristic
was already represented the same way as, e.g., each mathematical function.
Thus the heuristics were already capable of working on each other. For
instance, a heuristic rule that said "If an operation is sometimes useful
but very time-consuming, Then specialize it" could apply to
FunctionalComposition, but also to heuristics -- in fact, in one run of
Eurisko, it even applied to itself. Lisp programming also required no
real change in the RLL system, as each Lisp function has a unit to
represent it. While reasoning about Lisp predicates, Eurisko noticed that
EQ appeared to be a restriction of EQUAL, disjoined an EQ test onto the
front of EQUAL to see if it ran faster -- and lo and behold, it did! This
minor glitch in Interlisp-D has since been repaired. In the field of
Games, Eurisko applied the same set of heuristics as used in its other
domains to the task of designing a naval fleet of ships to be used in
wargames, and the fleet it designed won first place in the national
Traveller TCS fleet battle tournament on July 3-5, 1981. The important
idea in all this is that self-description and uniformity are powerful --
so powerful that they more than repay the initial cost of carefully
describing every piece of the system to itself. Having knowledge which
can be used across domains makes building expert systems easier and
easier, as that common KB grows.
While other ESBp systems have proven expedient for tackling particular
problems in particular domains, such systems seem inescapably limited in
the scope of problems they can solve. We feel this problem is intrinsic
to such ESBps, at least until they, like the human programmers they are
trying to emulate, are capable of a crude understanding of the code they
are attempting to compose -- an understanding which is far deeper than
the superficial level any current system (including RLL) has yet achieved.
This work grew out of earlier research on automatic programming [Green et
al, 1975], and AP still serves as the "engine" that converts
self-description into self-modification. RLL was written with the goal of
attaining the necessary level of understanding to carry out such a massive
automatic programming task.
BIBLIOGRAPHY
[Green et al 75] Progress Report on Program Understanding Systems, SAIL Memo
[Greiner 80] "RLL-1: ..."
[Greiner & Lenat 80a] AAAI
[Greiner & Lenat 80b] Details of RLL
[Lenat 81] The Nature of Heuristics, HPP Memo and Xerox Report
[Lenat, Hayes-Roth, & Waterman] Cognitive Economy
[Smith] CORLL
[Genesereth, Greiner & Smith] MRS